Abstract


Heart disease is a major cause of mortality today. Heart auscultation (the monitoring of
sounds produced by the heart) is a simple tool in the diagnosis of heart diseases, especially valvular
diseases. It is particularly important in primary health care, owing to its effectiveness in detecting
a wide range of heart abnormalities and to the low cost of the equipment involved. However,
forming a diagnosis based on heart sounds is a skill that can take years to acquire. Particularly
in remote areas and in developing countries like ours, physicians with the necessary training may
not be widely available.

This thesis presents a diagnosis system based on heart auscultation. A library of heart sound
files recorded via an electronic stethoscope is used. Features are extracted from these samples
using the discrete wavelet transform, and classification is carried out using a feed-forward
neural network. The performance of the system was satisfactory considering the paucity of data
available for training.
Chapter 1
Introduction 


According to the World Health Organization (WHO), heart disease and stroke kill some 17 million
people a year, which is almost one-third of all deaths globally. By 2020, heart disease and
stroke will become the leading cause of both death and disability worldwide.

“No matter what advances there are in high-technology medicine, the fundamental message is
that any major reduction in deaths and disability from heart disease and stroke will come primarily
from prevention, not just cure. We believe that early cardiac physical examination (screening)
will become a fundamental element of any prevention program. ” — Dr Judith Mackay, co-author 
of the WHO Atlas of heart disease and stroke. 

So, it is very clear that proper diagnosis of heart disease is important for patients to survive.
Physicians have to know the condition of the heart to decide between surgery and non-invasive treatment.
Though the electrocardiogram (ECG) is an important tool for diagnosis, it has some drawbacks:

• it can detect diseases that are more or less related to blood-circulation and blood vessels, but 
there are heart diseases (structural abnormalities in heart valves and defects characterized 
by heart murmurs) that are difficult to detect using ECGs. 

• the cost of ECG equipment is high.

• ECG equipment has limited availability.

• special skill is required to administer the ECG and interpret its results.

The problem is similar with the more recently developed echocardiography, as the equipment is bulky
and expensive. Thus, in remote areas or in developing countries, auscultation (diagnosis through
heart sounds) seems a feasible alternative. Where they are available, echocardiography or ECG could
be combined with auscultation to achieve an even better diagnosis.
It is worthwhile to mention that, historically, the bare ear and the stethoscope were of great help
in diagnosing most heart diseases, but auscultation has been somewhat eclipsed in the research
literature by the advent of electrocardiographic methods. However, forming a diagnosis based on
sounds heard through either a conventional acoustic stethoscope or an electronic one is itself a 
very special skill, and it may take years to acquire. Despite its obvious utility, because this skill is 
also very difficult to teach and grasp in a structured way, the majority of medicine and cardiology 
programs offer no such instruction. It would be very useful if the benefits of auscultation could 
be obtained with a simpler method, and using low-cost, easy to use equipment. In this thesis we 
report the design and testing of a digital auscultation system which can be used for heart sound 
based diagnosis. In addition, the system can be used for training medical students since heart 
sounds can both be heard and seen on the screen. 

1.1 Heart Sounds 

Heart sounds are complex and highly nonstationary signals. The “beats” associated with these
sounds are reflected in the signal by periods of relatively high activity, alternating with
comparatively long intervals of low activity. The heart beat usually has two sound components: Lub
and Dub. Lub is the first sound and Dub is the second sound. They are also referred to as S1
and S2. These sounds follow each other in a cyclic fashion.

In addition to S1 and S2, which are always present, third and fourth heart sounds (S3 and S4)
may also be heard. If present, S3 occurs shortly after S2. When S4 is audible, it occurs shortly
before S1.


1.1.1 Sources of Heart Sounds 

While the pathological origins of all contributions to these sounds (S1, S2, etc.) are not agreed
upon, it is clear that the closure of the heart valves is the major contributor.

Heart Valves

Blood is pumped through the heart in only one direction. Heart valves play key roles in this 
one-way blood flow, opening and closing with each heartbeat. Pressure changes behind and in 
front of the valves allow them to open their flap-like “doors” at just the right time, then close 
them tightly to prevent a backflow of blood. There are 4 valves in the heart: 

• Tricuspid valve 

• Pulmonary valve 

• Mitral valve 
• Aortic valve


How Normal Heart Valves Work 

The heart is divided into four chambers. The upper chambers are called atria and the lower 
chambers are called ventricles. Each heart cycle has two phases. In the first phase, called systole,
blood without oxygen returns from the body and flows into the heart’s upper-right chamber (the
right atrium). From there, it is forced through the tricuspid valve into the lower-right chamber
(the right ventricle). The right ventricle pumps the blood through the pulmonary valve and into 
the lungs. While in the lungs, the blood picks up oxygen. As the right ventricle is preparing to 
push blood through the pulmonary valve, the tricuspid valve closes to stop blood from flowing 
back into the right atrium. 



Figure 1.1: Two phases of a typical heart cycle : systole and diastole 


In the next phase, diastole, oxygen-rich blood returning from the lungs flows into the upper- 
left chamber (the left atrium). This blood is forced through the mitral valve into the lower-left 
chamber (the left ventricle) with the mitral valve sealing off to stop the backflow of blood. At the 
same time that the right ventricle is pumping the blood without oxygen into the lungs, the left 
ventricle is pushing the blood with oxygen through the aortic valve and on to all of the body’s 
organs. 

1.1.2 Heart Valve Problems 

Valve disease occurs when a valve does not work the way it should. If a valve does not open
all the way, less blood can move through the smaller opening. If a valve does not close tightly,
blood may leak backward. These problems may mean the heart has to work harder to pump the
same amount of blood, or blood may back up in the lungs or body because it is not moving
efficiently through the heart.

• closing problem : insufficiency (also called regurgitation) results when the valve doesn’t
close tightly. The valve’s supportive structures may be loose or torn, or the valve itself
may have stretched or thinned. Blood may then leak back in the wrong direction through
the valve.

• opening problem : stenosis occurs when a valve does not open completely. The valve 
may have become hardened or stiff with calcium deposits or scarring, so it is hard to push 
open. Blood has to flow through a smaller opening, so less blood gets through the valve 
into the next chamber or into the body. 

In this thesis, we mainly deal with these two kinds of problems, namely regurgitation (or
insufficiency) and stenosis of the heart valves.

1.1.3 Heart Murmurs 

A heart murmur is a swishing or whistling sound that the doctor hears when listening to a
patient’s heart with a stethoscope.

A murmur is usually present when there is a heart valve problem. The doctor will perform a
variety of tests to determine what kind of valve problem the patient has and whether it
is serious. These tests include an echocardiogram, an electrocardiogram and a chest
x-ray.


1.1.4 Auscultation 

Auscultation is the part of the physical examination that involves listening with a stethoscope
to the sounds made by the heart and interpreting them. It is a technique that
doctors have used for a long time. Generally, the stethoscope is moved over some specific areas of the
chest for proper auscultation. The following figure¹ shows the different auscultation areas.

1.2 Phonocardiogram

A phonocardiogram is the graphical representation of the sounds produced by the heart. It helps to
visualize the different sound components present in a heart cycle and the time of their occurrence.
Here is a snapshot of a phonocardiogram displayed by our system.

¹ Courtesy: www.3m.com
Figure 1.3: Phonocardiogram

1.3 Problem Definition and Approach 

This thesis describes the development of an auscultation system. First, we have to represent
the heart sound (or auscultation) graphically, in the form known as a phonocardiogram. This helps the
doctor to hear the sound and visualize it simultaneously. The second part is concerned with a
diagnosis system for valvular irregularities. This requires feature extraction and selection from the
phonocardiogram and then a classification algorithm to classify the irregular phonocardiograms.
The irregularities we deal with are:

• Aortic Regurgitation 

• Aortic Stenosis 

• Mitral Regurgitation 

• Mitral Stenosis 

These diseases are all related to heart valves. 

In our work, we use local signal analysis methods (wavelet transform) and classification
techniques (neural network) to characterize and interpret sounds corresponding to symptoms
important for diagnosis. It is hoped that the results of this analysis may prove valuable both
as a diagnostic aid in themselves and as input to more sophisticated machine diagnosis systems.

1.4 Organization of the Report 

The report is organized as follows: 

• In Chapter 2 we review existing work in this field, and then indicate our contribution in 
this area. 

• In Chapter 3 we give an introduction to an intelligent heart sound classification system
and discuss the different components of this system. These include the feature extraction tool
and the classification tool.

• In Chapter 4 we describe our system in detail. The different aspects of the implementation
are covered here. We also discuss the results obtained using our system.

• Finally, Chapter 5 concludes and mentions possible extensions to this work. 
Chapter 2
Related Work 


When studying the research literature, we found some research on the use of the phonocardiogram
to diagnose heart diseases. We also studied work on the wavelet transform and neural
networks for feature extraction and classification of signals, because these are most relevant to
our work. In the following section, the relevant literature is reviewed.

2.1 Previous Work 

Ghosh, Deuser and Beck proposed a neural network based hybrid system for detection,
characterization and classification of short-duration oceanic signals [6]. These are underwater
signals obtained from passive sonar and contain valuable clues for source identification. After
preprocessing (denoising) the signal, they used multiresolution wavelets to find the feature
vector. Then three classifiers, namely a statistical classifier (kNN), an ANN and a recurrent network,
were used to classify those signals. Finally, the outputs of the different classifiers were combined. In
their experiment they obtained almost 100% accuracy.

Nguyen, Hammel and Gong used a system of multiple hybrid neural networks [19] to classify
contact signals recorded at open ocean sites. They used a self-organizing feature map and a neural
network for classification, with 1800 samples for training and 900 samples for testing, and
obtained almost 100% classification accuracy.

Hippenstiel and Fargues used wavelets [2] to extract features from digital communication
signals. We have studied the above works because the signals they deal with are similar in
nature to phonocardiogram signals in terms of transient behaviour.

In the late 1980s, Abdalla S. A. Mohamed and Hazem M. Raafat at the University of Regina,
Canada worked on recognition of heart sounds and murmurs for cardiac diagnosis [12, 13].
They developed a mathematical model to describe the heart sounds and murmurs by a finite
number of parameters. An autoregressive model was selected to represent the heart sound at
principal locations of cardiac auscultation and for different heart diseases. Feature extraction of
the signals was based on fourth-order linear prediction of the cardiac cycle frames.
Classification was then carried out based on the minimum distance between the features of the
measured pattern and the reference patterns.

With the advent of electrocardiogram (ECG) the focus shifted to ECGs. People tried to 
classify ECGs and predict the heart diseases accordingly. J. Zhu, N. Hazarika, A. C. Tsoi and 
A. A. Sergejew have worked [14, 15, 1] on this topic. They selected three types of ECG signals: 
Normal, Schizophrenia (SCH) and Obsessive Compulsive Disorder (OCD). Wavelet transform 
was used for feature extraction and a three-layered feedforward network which implements the 
backpropagation algorithm was used for learning. The system was able to classify over 66% of
the normal class and 71% of the schizophrenia class of ECGs.

Though research on the diagnosis of heart disease using heart sounds was somewhat eclipsed
by ECGs, work on heart sounds still continued.

In 1997, Huiying, Sakari and Iiro developed a heart sound segmentation algorithm [8]. The
algorithm separates the heart sound signal into four parts: the first heart sound, the systolic period,
the second heart sound and the diastolic period. They used digital heart sounds recorded on
a multimedia PC equipped with an electronic stethoscope. First, phonocardiograms of these
heart sounds were created. Then, using multi-level wavelet decomposition and reconstruction, the
detail and approximation components of the phonocardiograms were extracted. They marked
the locations where the signal value exceeded a selected threshold. In this way they identified the
S1s and S2s, and segmented the signals accordingly. The performance of the algorithm
was evaluated using 1165 cardiac periods from 77 digital phonocardiographic recordings, including
normal and abnormal heart sounds. It showed over 93 percent accuracy.

Later in 2001, Lee, Kim and Hong proposed some methods for heart sound recognition [9]. 
They used three recognition techniques and compared the result. The first method recognizes the 
characteristics of heart sound by integrating important peaks and analyzing statistical variables 
in the time domain. The second method builds a database by principal component analysis 
on a set of training heart sounds in time domain. Later, this database is used for recognizing 
new heart sounds. The third method builds a similar database, but in time-frequency domain. 
They tried to classify the heart sounds into seven classes. It was noticed that the third method 
outperformed the others. 

Finally, some works closely related to our thesis are as follows. M. S. Obaidat and M. M.
Matalgah studied the performance of the short-time Fourier transform and wavelet transform
in phonocardiogram (PCG) signal analysis [18]. They compared the performance of the FT, STFT
and wavelet transform. They found that the wavelet transform is capable of detecting the two
components of the second heart sound (S2) of a normal PCG signal, the aortic valve component A2
and the pulmonary valve component P2, which cannot be detected by the FT and STFT.
In 2001, Todd R. Reed, Nancy E. Reed and Peter Fritzson worked on analysis of heart sounds
for symptom detection [17]. First, heart sounds were segmented (manually) into sample segments,
each consisting of a single heartbeat cycle. Then each segment was transformed using wavelet
decomposition based on the Coifman 4th-order wavelet. The transformed vector was reduced to
a smaller size by discarding the levels with the shortest scale. Finally, each vector was classified
using a three-layer neural network. The system was evaluated using heart sounds corresponding
to five different conditions. It gave 100% accuracy for all heart sounds.

In 2002, Ibrahim Turkoglu, Ahmet Arslan and Ilkay Erdogan presented an intelligent
pattern recognition system to diagnose mitral valve diseases [3] using Doppler signals. The
Doppler signals of the mitral valve were obtained by placing a transducer over the chest of the
patient with the aid of ultrasonic imaging. Wavelet packet decomposition was used to extract the
features, and classification was then carried out using a neural network. The performance of the
system was evaluated on 105 samples, comprising 39 normal and 66 abnormal subjects. The
accuracy obtained was almost 94% for normal and abnormal subjects.

Onsy Abdel-Alim, Naddar Hamdy and Mohammed A. E. developed a system for heart disease
diagnosis using heart sounds [16]. A 1-minute long record of heartbeats (obtained from the Faculty
of Medicine, Ain Shams University, Egypt) was used to extract features. They chose two sets of
features. The first set consists of features such as the duration of the first and second heart sounds,
their ratio, their difference and so on. The second set of features was obtained using the Daubechies
wavelet transform. The location of the stethoscope on the chest during recording was also considered as
an additional feature. A feed-forward neural network was used for classification. They used 650
cases for training the network and 200 cases for testing. A recognition rate of 95% was obtained
in this case.

In 2003, M. L. Jacobson did his work on analysis and classification of physiological signals 
[7]. In particular he worked on heart rate variability (HRV) signals, which have been shown to 
contain diagnostic information on the condition of a patient’s cardiac and circulatory system. 
He used wavelet transform decomposition (12th order Daubechies wavelet) as a means of signal 
characterization. Classification was implemented through cluster assignment based on Euclidean 
distance. The class centers were computed as the average of the known patient conditions. The 
unknown patient condition was assigned to the class whose center was closest. He used only two
diseases, namely coronary heart disease (CHD) and diabetes mellitus (DM), in his work, and the
accuracy obtained was 100% and 97% for DM and CHD respectively.
2.2 Our Contribution


We have developed a system for display and diagnosis of heart auscultation. It helps to visualize 
the heart signals graphically. There is also provision for listening to the sound, where a cursor 
moves over the display in synchrony with the audio. This helps the users (mainly doctors) in 
better identifying the different sound components. Our system also helps to diagnose heart 
diseases, by analysing and classifying the heart sound signals. 

In the previous section, we saw that some work has been done on analysis and classification
of heart sound signals. However, we noticed that most of this work skipped the job of finding a
heart-beat cycle in a heart sound signal of multiple cycles. Some authors started from
already extracted cycles, and the others extracted the cycles manually. We have developed a
technique that can extract a cycle from a longer heart sound signal. We also noticed that, because
the recording environment was not studio-like, a lot of noise was added to the sound samples.
So, we used some data-smoothing techniques to filter out the noise to some extent.

The wavelet transform is widely regarded as well suited to the analysis of transient
signals like heart sounds. So, we use Daubechies wavelets to extract features. We compare the
results given by the D4, D6 and D8 wavelets and choose the best one. Then, using a back-propagation
neural network, we classify the heart sound signals into different categories.
Chapter 3

Structure of Classification System 


Classification is the process of assigning a label to an unknown pattern so that it is categorized 
into one of several known categories. In our work, we wish to classify heart sound samples into 
(probable) heart-disease categories. In this chapter, the theoretical foundations for the pattern 
recognition and classification system used in our study are discussed. 

The following figures show the block diagrams of a typical classification system, which
consists of a training module and a testing module.


Heart Sound -> Feature Extractor -> Feature Selector -> Training Module -> Trained Network

Figure 3.1: Block diagram of the training module

Trained Network -> Label

Figure 3.2: Block diagram of the testing module


3.1 Data Acquisition and Pre-processing 

3.1.1 Data Acquisition 

The first step is data acquisition. Data (or samples) to be used by the system are
collected. An electronic stethoscope is used to record the heart beats of different patients, and
these recordings are then used for training the classifier.

The raw data collected may not always be usable by the system. There can be noise in the
data, some data values may be missing, and so on, so the data is preprocessed.

3.1.2 Data Cleaning 

Although most classification algorithms have some mechanisms for handling noisy or missing 
data, this step can help reduce confusion during learning. 

Generally, noise removal is done by using filtering techniques or through data smoothing tech- 
niques. Missing values are replaced with the most commonly occurring value for that attribute 
or with the most probable value based on statistics. 
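
The two cleaning steps above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in our system: the moving-average filter, the window width of 9 samples, the use of the mean as the "most probable" replacement value, and the synthetic test signal are all hypothetical choices made for the example.

```python
import numpy as np

def moving_average(signal, width=5):
    """Smooth a 1-D signal with a centered moving-average window."""
    kernel = np.ones(width) / width
    # mode="same" keeps the output the same length as the input
    return np.convolve(signal, kernel, mode="same")

def fill_missing_with_mean(values):
    """Replace NaN entries with the mean of the observed entries (an assumed statistic)."""
    values = np.asarray(values, dtype=float)
    filled = values.copy()
    filled[np.isnan(filled)] = np.nanmean(values)
    return filled

# Synthetic example: a 5 Hz sine wave corrupted by Gaussian noise (hypothetical data)
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 500)
clean = np.sin(2 * np.pi * 5 * t)
noisy = clean + 0.3 * rng.standard_normal(t.size)
smoothed = moving_average(noisy, width=9)
```

A wider window removes more noise but also blurs short transients, so the width would have to be chosen with the duration of the heart sound components in mind.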

3.1.3 Normalization 

Data should also be normalized, particularly when neural networks or methods involving distance
measurements are used in the learning step. Normalization involves scaling all values for a given
attribute so that they fall within a small specified range, such as -1.0 to 1.0, or 0.0 to 1.0. 
In methods that use distance measurements, for example, this would prevent attributes with 
initially large ranges from outweighing attributes with initially smaller ranges (such as binary 
attributes). 
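
A minimal sketch of such scaling is given below, assuming simple min-max normalization to the range -1.0 to 1.0. The function name `min_max_scale` and the small feature matrix are hypothetical and serve only to illustrate the idea.

```python
import numpy as np

def min_max_scale(column, low=-1.0, high=1.0):
    """Linearly rescale one attribute so that its values span [low, high]."""
    column = np.asarray(column, dtype=float)
    span = column.max() - column.min()
    if span == 0:
        # A constant attribute carries no information; map it to the middle of the range
        return np.full_like(column, (low + high) / 2.0)
    return low + (column - column.min()) * (high - low) / span

# Hypothetical feature matrix: one wide-range attribute and one binary attribute
features = np.array([[1200.0, 0.0],
                     [300.0, 1.0],
                     [750.0, 0.0]])
scaled = np.column_stack(
    [min_max_scale(features[:, j]) for j in range(features.shape[1])]
)
```

After scaling, both attributes span the same range, so the wide-range attribute no longer dominates a distance computation.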

3.2 Feature Extraction 

This is arguably the most important component of designing the classification system, since even 
the best classifier will perform poorly if the features are not chosen well. A feature extractor 
should reduce the pattern vector (i.e. the original waveform) to a lower dimension, which contains 
most of the useful information from the original vector. Thus it converts patterns to features in 
a condensed representation. Ideally, it should give only “relevant or important” information.

We briefly discuss three feature extraction techniques that are widely used for time-varying
signals. These are the Fourier Transform, the Short-Time Fourier Transform and the Wavelet Transform.

3.2.1 Fourier Transform 

Joseph Fourier showed that any 2π-periodic function f(x) can be expressed as the sum of a
possibly infinite series of sines and cosines. The sum is also referred to as a Fourier expansion.

$$f(x) = a_0 + \sum_{k=1}^{\infty} \left( a_k \cos kx + b_k \sin kx \right)$$

The coefficients a_0, a_k and b_k are calculated by

$$a_0 = \frac{1}{2\pi} \int_0^{2\pi} f(x)\,dx$$

$$a_k = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos(kx)\,dx$$

$$b_k = \frac{1}{\pi} \int_0^{2\pi} f(x) \sin(kx)\,dx$$

The Fourier Transform’s usefulness lies in its ability to analyze a signal in the time domain for its
frequency content. The transform works by first translating a function in the time domain into
a function in the frequency domain. The signal can then be analyzed for its frequency content
because the Fourier coefficients of the transformed function represent the contribution of each
sine and cosine function at each frequency.

However, the big disadvantage of a Fourier expansion is that it has only frequency resolution
and no time resolution. This means that although we might be able to determine all the frequencies
present in the signal, we do not know when they are present. This is because the sines and cosines
which form the basis of the Fourier transform are non-local and stretch out to infinity. They
are therefore very poor at approximating data with sharp discontinuities.

Before moving to the next section, let us briefly discuss stationary and non-stationary
signals. A stationary signal is one whose frequency content does not change over time. So, in
the case of stationary signals, one does not need to know at what times which frequency components
exist, since all frequency components exist at all times. This is not the case with
non-stationary signals. Here, the frequency content changes over time. Some frequencies which are
present at a particular time instant may not be present later. Thus, finding the frequency
components and their time of occurrence in a non-stationary signal becomes a challenging job.
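
The lack of time resolution can be illustrated with a small numerical experiment. The sketch below is an illustration only; the sampling rate of 1000 Hz and the two tones at 50 Hz and 120 Hz are arbitrary assumptions. It builds two non-stationary signals that contain the same two frequencies in opposite order and compares the magnitudes of their Fourier transforms.

```python
import numpy as np

fs = 1000                      # assumed sampling rate in Hz
n = np.arange(fs)              # one second of samples
t = n / fs
first_half = n < fs // 2

low_tone = np.sin(2 * np.pi * 50 * t)
high_tone = np.sin(2 * np.pi * 120 * t)

# Signal A: 50 Hz first, then 120 Hz. Signal B: the same tones in reverse order.
signal_a = np.where(first_half, low_tone, high_tone)
signal_b = np.where(first_half, high_tone, low_tone)

mag_a = np.abs(np.fft.rfft(signal_a))
mag_b = np.abs(np.fft.rfft(signal_b))
freqs = np.fft.rfftfreq(n.size, d=1 / fs)

# The two strongest frequency components of signal A
peaks_a = sorted(freqs[np.argsort(mag_a)[-2:]])

# The magnitude spectra coincide although the signals differ in time
same_spectrum = np.allclose(mag_a, mag_b, atol=1e-6)
```

Both spectra show the same two peaks, so the magnitude spectrum alone cannot tell which tone came first.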

3.2.2 Short Time Fourier Transform 

The problem with the Fourier transform is that it does not work well for non-stationary signals. Now,
can we treat some portion of a non-stationary signal as stationary? If the region where the signal
can be assumed to be stationary is small, then we look at the signal through a narrow window,
narrow enough that the portion of the signal seen through the window is indeed stationary. This
approach leads to a revised version of the Fourier transform, called the Short Time Fourier
Transform (STFT).

There is only a minor difference between the STFT and the FT. In the STFT, the signal is divided into
segments small enough that each segment can be assumed to be stationary. For this purpose, a window
function w is chosen and placed at the very beginning of the signal. The window function and the
signal are then multiplied. By doing this, only the portion of the signal under the window is selected.
The product is treated as just another signal, whose FT is to be taken. If this portion of the signal
is stationary, as assumed, then there will be no problem and the obtained result will be a true
frequency representation of that portion. The next step is shifting the window to a new location,
multiplying it with the signal, and taking the FT of the product. This procedure is repeated until
the end of the signal is reached, shifting the window by t1 seconds at each step. The following
figure explains this process.



Figure 3.3: Short Window Fourier Transform 


The following definition of the STFT summarizes the above explanation:

$$\mathrm{STFT}(t', f) = \int \left[ x(t)\, w^{*}(t - t') \right] e^{-j 2\pi f t}\, dt$$

Here, x(t) is the signal itself, w(t) is the window function, and * is the complex conjugate. One
can see that the STFT of the signal is nothing but the FT of the signal multiplied by a window
function. Here, for every t' (time) and f (frequency), a new STFT coefficient is computed.
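
The windowing procedure can be sketched in a few lines. The code below is a simplified illustration, not the implementation used in this work: the Hann window, the window width of 100 samples, the hop of 50 samples and the test signal are assumptions. The window is real, so the complex conjugate has no effect, and each frame is transformed relative to its own start, which changes only the phase of the coefficients.

```python
import numpy as np

def stft(x, window, hop):
    """Return STFT coefficients, one row per window position t'."""
    width = window.size
    starts = range(0, x.size - width + 1, hop)
    # Multiply the signal by the shifted window, then take the FT of each product
    frames = np.array([x[s:s + width] * window for s in starts])
    return np.fft.rfft(frames, axis=1)

fs = 1000                                   # assumed sampling rate in Hz
t = np.arange(fs) / fs
# Non-stationary test signal: 50 Hz in the first half, 120 Hz in the second half
x = np.where(t < 0.5, np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 120 * t))

window = np.hanning(100)                    # 0.1 s window (assumed)
coeffs = stft(x, window, hop=50)
freqs = np.fft.rfftfreq(window.size, d=1 / fs)

# Dominant frequency seen through each window position
dominant = freqs[np.argmax(np.abs(coeffs), axis=1)]
```

Unlike the plain FT, the array `dominant` changes along the time axis, so the order in which the two tones occur becomes visible.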

So, now we have a true time-frequency representation of the signal. We not only know what
frequency components are present in the signal, but we also know where they are located in time.
But there are problems with the STFT too. The problem has to do with the width of the
window function that is used. In the last section we saw that the FT has no resolution problem
in the frequency domain, i.e. we know exactly what frequencies exist. The time resolution in
the FT, however, is zero, since we have no information about time. What gives the perfect frequency
resolution in the FT is the fact that the window used in the FT is its kernel, the sine and cosine
functions, which last for all time from -∞ to +∞. Now, in the STFT, our window is of finite length;
it covers only a portion of the signal, which causes the frequency resolution to get poorer. That means
we no longer know the exact frequency components that exist in the signal, but only a
band of frequencies that exist.

Thus there is a trade-off. If we use a window of infinite length, we get the FT, which
gives perfect frequency resolution but no time information. On the other hand, in order to obtain
stationarity, we have to have a window short enough that the signal within it is stationary. The
narrower we make the window, the better the time resolution and the better the assumption of
stationarity, but the poorer the frequency resolution. The wavelet transform addresses this trade-off
problem.
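
The trade-off can be made concrete with a small calculation. For a window of N samples at sampling rate fs, the window spans N/fs seconds, while the spacing between the frequency bins of its discrete Fourier transform is fs/N Hz. The sketch below tabulates both quantities; the sampling rate of 8000 Hz and the three window widths are assumed values chosen only for illustration.

```python
fs = 8000  # assumed sampling rate in Hz

def resolution(width, fs):
    """Return (time span in seconds, frequency bin spacing in Hz) of a window."""
    return width / fs, fs / width

# One row per window width: (samples, time span, frequency spacing)
table = [(w, *resolution(w, fs)) for w in (64, 256, 1024)]

for width, time_span, freq_spacing in table:
    print(f"{width:5d} samples: {time_span * 1000:6.1f} ms, {freq_spacing:6.2f} Hz")
```

The product of the two quantities is always 1, so a fixed window can improve one resolution only at the expense of the other.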

3.2.3 Wavelet Transform 

A wavelet allows one to do multi-resolution analysis, which helps to achieve both time and
frequency localization. Here, the scale (or resolution; it is actually the inverse of frequency) that
we use to look at the data plays a vital role. Wavelet algorithms process data at different scales or
resolutions. If we look at a signal with a large “window”, we notice gross (or averaged)
features. Similarly, if we look at a signal with a small “window”, we notice detailed
features. Thus, by using varying resolution, the wavelet transform avoids the problem that the STFT
has because of its fixed window size (or resolution).

The following figure compares the relative frequency and time domain resolution of the STFT and the
wavelet transform:

[Figure: two panels, each plotting Frequency against Time]

Figure 3.4: Comparative time and frequency resolution of STFT and wavelet transform

At the core of a wavelet analysis procedure is the choice of a wavelet prototype function, called a
mother wavelet. Temporal analysis is performed with a contracted, high-frequency version of the
prototype wavelet, while frequency analysis is performed with a dilated, low-frequency version of
the same wavelet. Because the original signal can be represented in terms of a wavelet expansion
(using coefficients in a linear combination of the wavelet functions), data operations can be
performed using just the corresponding wavelet coefficients.

A Brief Introduction to Wavelets

The word “wavelet” literally means “small wave”. We have seen that the basis functions of the
Fourier Transform are sine and cosine waves, which extend over the entire time axis from -∞ to
+∞. That is the reason why the FT cannot provide time resolution. This is not the case with
wavelets. Wavelets are localized waves that extend not from -∞ to +∞ but only for a finite
duration. So, they can provide both time and frequency resolution.

Wavelets were first introduced in the work of A. Haar (1909). One property of the Haar wavelet
is that it has compact support, which means that it vanishes outside of a finite interval. However,
Haar wavelets are not continuously differentiable, which somewhat limits their
application.
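
The effect of compact support can be seen in a single level of the Haar transform, sketched below. This is an illustration only; our system uses Daubechies wavelets rather than the Haar wavelet, and the function names and the short test signal are hypothetical. Each pair of neighbouring samples is replaced by a scaled sum (approximation) and a scaled difference (detail).

```python
import numpy as np

def haar_step(x):
    """One level of the orthonormal Haar transform of an even-length signal."""
    pairs = np.asarray(x, dtype=float).reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)   # local averages
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)   # local differences
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

# A flat signal with one short transient in the third pair of samples
signal = np.array([4.0, 4.0, 4.0, 4.0, 9.0, 1.0, 4.0, 4.0])
approx, detail = haar_step(signal)
restored = haar_inverse(approx, detail)
```

Only the detail coefficient that overlaps the transient is non-zero, which shows how a wavelet with finite support localizes an event in time; the original signal is recovered exactly from the two coefficient sets.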

In the 1960s and 1970s, R. Coifman carried out some studies on wavelets. Later, in 1980, Grossmann and
Morlet defined wavelets in the context of quantum physics. In 1985, Stéphane Mallat used wavelets
for digital signal processing. He discovered relationships between quadrature mirror filters,
pyramid algorithms and orthonormal wavelet bases (we discuss these later). Inspired by
this work, Y. Meyer constructed the first non-trivial wavelets. Unlike the Haar wavelet, the Meyer
wavelets are continuously differentiable; however, they do not have compact support. A couple of
years later, Ingrid Daubechies used Mallat’s work to construct a set of orthonormal basis functions
that are perhaps the most elegant. These functions are widely used in today’s wavelet applications. In our
work, we use Daubechies’ wavelets.

Continuous Wavelet Transform

The continuous wavelet transform can be formally written as:

\[ \gamma(s,\tau) = \int f(t)\, \psi^{*}_{s,\tau}(t)\, dt \]

The * denotes complex conjugation. This equation shows how a function $f(t)$ is decomposed into
a set of basis functions $\psi_{s,\tau}(t)$, called wavelets. The variables $s$ and $\tau$, scale
and translation, are the new dimensions after the wavelet transform.

The wavelets are generated from a single basic wavelet $\psi(t)$, the so-called mother wavelet, by
scaling and translation:

\[ \psi_{s,\tau}(t) = \frac{1}{\sqrt{s}}\, \psi\!\left(\frac{t-\tau}{s}\right) \]

Here, $s$ is the scale factor, $\tau$ is the translation factor and the factor $s^{-1/2}$ is used
for energy normalization across the different scales.


It should be noted that the above equations do not specify the wavelet basis functions.
This is the main difference between the Fourier transform and the wavelet transform. The theory
of the wavelet transform deals only with the general properties of wavelets. It thus defines a
framework within which one can design wavelets suited to a particular application.

Discrete Wavelet Transform

The continuous wavelet transform (CWT) described in the last section is redundant. The CWT is
calculated by continuously shifting a continuously scalable function over a signal and calculating
the correlation between them. These scaled functions are nowhere near an orthonormal basis, so
the wavelet coefficients obtained are highly redundant. To remove this redundancy, the discrete
wavelet transform (DWT) is used. In the DWT the scale and translation parameters are chosen
such that the resulting wavelets form an orthogonal set, i.e. the inner product of any two
distinct wavelets $\psi_{s,\tau}$ is zero.


Discrete wavelets are not continuously scalable and translatable; they can only be scaled and
translated in discrete steps. This is achieved by modifying the wavelet representation as

\[ \psi_{s,\tau}(t) = \frac{1}{\sqrt{s_0^{s}}}\, \psi\!\left(\frac{t - \tau\,\tau_0\, s_0^{s}}{s_0^{s}}\right) \]

Here $s$ and $\tau$ are integers, $s_0 > 1$ is a fixed dilation step, and $\tau_0$ is the
translation factor, which depends on the dilation step. The effect of discretizing the wavelet is
that the time-scale space is now sampled at discrete intervals. We generally choose $s_0 = 2$ so
that the sampling of the frequency axis corresponds to dyadic sampling. For the translation factor
we generally choose $\tau_0 = 1$. In that case the previous equation becomes:

\[ \psi_{s,\tau}(t) = \frac{1}{\sqrt{2^{s}}}\, \psi\!\left(\frac{t - \tau\, 2^{s}}{2^{s}}\right) \]


One of the most useful features of wavelets, especially for engineers, is that the defining
coefficients of a wavelet system can be chosen to suit a given problem.
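The orthogonality of dyadic wavelets can be checked numerically. The sketch below is only an illustration: it uses the Haar wavelet as the mother wavelet (chosen here because it has a simple closed form, unlike the Daubechies wavelets used in this work), builds psi_{s,tau}(t) = 2^(-s/2) psi((t - tau 2^s) / 2^s), and approximates inner products with a Riemann sum. The function names and the integration range are assumptions made for the example.

```python
import math

def haar(t):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0

def dyadic_wavelet(s, tau):
    """Return psi_{s,tau}(t) = 2**(-s/2) * psi((t - tau * 2**s) / 2**s)."""
    scale = 2.0 ** s
    return lambda t: haar((t - tau * scale) / scale) / math.sqrt(scale)

def inner_product(f, g, lo=0.0, hi=8.0, steps=8192):
    """Midpoint Riemann-sum approximation of the integral of f(t) * g(t) over [lo, hi]."""
    dt = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * dt
        total += f(t) * g(t)
    return total * dt

# Distinct dyadic wavelets give an inner product close to 0;
# a wavelet with itself gives a value close to 1.
cross = inner_product(dyadic_wavelet(0, 0), dyadic_wavelet(1, 0))
norm = inner_product(dyadic_wavelet(1, 0), dyadic_wavelet(1, 0))
```

The integration range [0, 8] only covers wavelets whose support lies inside it, so larger scales or translations would need a wider range.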


From a signal processing point of view, wavelet transforms are like filter banks. Applying a
wavelet transform has the same effect as applying a filter, i.e. we get either the detail or the
average (smoothed) component of the original signal. The wavelet coefficients corresponding to a
particular wavelet are therefore called wavelet filter coefficients. The filter coefficients are
placed in a transformation matrix, which is applied to a raw data vector. The coefficients are
ordered using two dominant patterns: one that works as a smoothing filter (like a moving average)
and one that brings out the detail information. The following example illustrates the idea.

Consider a filter with four coefficients, $c_1, \ldots, c_4$. In this case, the transformation
matrix takes the following form and acts on a vector of data.


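\[
\begin{bmatrix}
c_1 & c_2 & c_3 & c_4 & 0 & 0 & \cdots \\
d_1 & d_2 & d_3 & d_4 & 0 & 0 & \cdots \\
0 & 0 & c_1 & c_2 & c_3 & c_4 & \cdots \\
0 & 0 & d_1 & d_2 & d_3 & d_4 & \cdots \\
\vdots & & & & & & \ddots
\end{bmatrix}
\]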




The first row generates one component of the data convolved with the filter coefficients
$c_1, \ldots, c_4$. Likewise the third, fifth and other odd rows. The even rows perform a different
convolution, with coefficients $d_1, \ldots, d_4$ (which are derived from $c_1, \ldots, c_4$). The
overall action of the matrix is thus to perform two related convolutions and then to reduce each
of them by half.

The filter $c_1, \ldots, c_4$ acts as a smoothing filter, something like a moving average of four
points, and the filter $d_1, \ldots, d_4$ gives the details. The $d$'s are chosen so that they give
a zero response to a sufficiently smooth data vector. From the signal processing point of view,
these two orderings of the coefficients are called a quadrature mirror filter pair. In the next
chapter we will explain the implementation of the above process in more detail.
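As a rough illustration of the matrix described above, the following sketch builds it for the four-coefficient Daubechies filter and applies it to a data vector. It is a minimal example under stated assumptions, not the implementation used in this work: the coefficients are written in closed form (they agree with the D4 values of Table 4.1), the detail filter is taken as (d_1, d_2, d_3, d_4) = (c_4, -c_3, c_2, -c_1), and the last rows are assumed to wrap around to the start of the vector. The helper names are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Smoothing coefficients c_1..c_4 of the four-coefficient Daubechies filter.
C = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# Detail coefficients d_1..d_4, derived from the c's (quadrature mirror pair).
D = [C[3], -C[2], C[1], -C[0]]

def transform_matrix(n):
    """Build the n x n matrix (n even, n >= 4); rows near the end wrap around (assumption)."""
    rows = []
    for k in range(0, n, 2):
        smooth = [0.0] * n
        detail = [0.0] * n
        for j in range(4):
            smooth[(k + j) % n] = C[j]
            detail[(k + j) % n] = D[j]
        rows.append(smooth)   # odd rows (1st, 3rd, ...): smoothing filter
        rows.append(detail)   # even rows (2nd, 4th, ...): detail filter
    return rows

def apply_matrix(matrix, x):
    """Multiply the matrix by the data vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]

# A constant (perfectly smooth) vector gives zero detail components.
output = apply_matrix(transform_matrix(8), [1.0] * 8)
details = output[1::2]
```

With the wrap-around convention the matrix is orthogonal, so the sum of squares of the output equals that of the input.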

Daubechies Wavelet

There are several types of wavelets, such as the Haar wavelet, the Daubechies wavelets and the
Coifman wavelets (coiflets). The different wavelet families make different trade-offs between how
compactly the basis functions are localized in space and how smooth they are. Daubechies wavelets
[4, 5] seem most suitable for applications that deal with physiological signals such as heart
sounds, so we use Daubechies wavelets in our work. The following figure (see footnote 1) shows one
Daubechies mother wavelet. The inset figure shows its fractal nature.

Daub4, Daub6 and Daub8:

Within each family of wavelets (e.g. the Daubechies family) there are wavelet subclasses
distinguished by the number of coefficients and by the level of iteration. Wavelets are often
classified within a family by the number of vanishing moments. This is a set of mathematical
relationships that the coefficients must satisfy, and it is directly related to the number of
coefficients. There are many Daubechies wavelets of this kind. The most common among them are
Daub4, Daub6 and Daub8, with 4, 6 and 8 coefficients respectively.

Figure 3.5: Daubechies mother wavelet

Footnote 1: http://www.amara.com/IEEEwave/IEEEwavelet.html

3.3 Feature Selection

Generally, the features extracted in the previous step are huge in number, often of the order of
thousands. They are also highly redundant, in the sense that many of them do not carry much
meaning and do not contribute to the classification decision. They therefore increase the
computation cost without contributing to categorization, and it is better to remove them. Thus,
the feature selection step deals with identifying a smaller number of meaningful features that
best represent a given pattern without much redundancy.

There are many techniques that are used for selection. Some of these are:

• Discarding the values that are insignificant (small or zero). Generally a threshold level is
chosen and all values smaller than the threshold are discarded.

• Choosing the m largest values from a set of n values.
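The two selection techniques listed above can be sketched in a few lines. This is a minimal illustration rather than the procedure used later in this work; it assumes that "small" and "large" refer to the magnitude of a value, and the function names are hypothetical.

```python
def discard_small(values, threshold):
    """Keep only the values whose magnitude is at least `threshold`."""
    return [v for v in values if abs(v) >= threshold]

def top_m(values, m):
    """Keep the m values of largest magnitude, preserving their original order."""
    ranked = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    keep = sorted(ranked[:m])
    return [values[i] for i in keep]

kept = discard_small([0.0, 0.5, -2.0, 0.01], threshold=0.1)
largest = top_m([0.1, -3.0, 2.0, 0.5], m=2)
```

Both functions return values only; if the position of a feature matters to the classifier, the retained indices would have to be returned as well.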

3.4 Classification 

Classification is the step in which a pattern is assigned a class label according to the
characteristic features selected for it. Many techniques are available for classification:

• Neural Network 

• Decision Tree 

• k - Nearest Neighbor (kNN) 

• Case-Based Reasoning 

We have used a neural network in our system because it suits our problem better than the other
techniques. The problem with decision trees is their sensitivity to noise, which results in
overfitting and hence poor generalization. Case-based reasoning deals with complex symbolic
descriptions of samples or “cases”, which is not applicable to our data set. The problem with kNN
is that, unlike an ANN, it assigns equal weight to each attribute. This degrades its performance
when there are many irrelevant attributes in the data set. Another problem with kNN is that it
stores all the training samples and does not build a classifier until a new sample needs to be
classified. This results in longer testing time.

3.4.1 A Brief Introduction to Neural Networks 

Artificial neural networks (ANNs) provide a robust, general and practical method for learning
real-valued, discrete-valued or vector-valued functions from examples (or samples). While ANNs
are loosely motivated by biological neural systems, there are many complexities in biological
neural systems that are not modeled by ANNs.

Roughly speaking, a neural network is a set of connected input/output units organized into
layers, the geometry and functionality of which have been likened to that of the human brain.
Each connection between the units has a weight associated with it. During the learning phase,
the network learns by adjusting the weights so as to be able to predict the correct class label of
the input samples. The following figure depicts a general neural network system:



Figure 3.6: A typical neural network structure 


Neural networks involve long training times and are therefore more suitable for applications
where this is feasible. They also require a number of parameters, such as the network topology,
that are typically best determined empirically. The biggest advantage of neural networks is
their high tolerance to noisy data.

The simplest neural networks are based on a perceptron unit. The problem with a perceptron is
that a single perceptron can only express linear decision surfaces.

Multilayer Network and the Back-Propagation Algorithm

To learn non-linear decision surfaces, a different model is adopted: a multilayer neural network
together with the backpropagation learning algorithm. Often a sigmoid unit is taken as the building
block of this model. A sigmoid unit is very much like a perceptron, but is based on a smoothed,
differentiable activation function. It first computes a linear combination of its inputs, then
applies the activation function to the result. The output $o$ can be written as

\[ o = \sigma(\vec{w} \cdot \vec{x}) \]

where

\[ \sigma(y) = \frac{1}{1 + e^{-y}} \]

$\sigma$ is called the sigmoid function; its output ranges between 0 and 1, increasing
monotonically with its input. The following figure shows a sigmoid unit.



Figure 3.7: A sigmoid unit 
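The computation performed by a single sigmoid unit can be sketched as follows. This is a minimal illustration, assuming the logistic function 1 / (1 + e^(-y)) as the activation; the function names are chosen for the example.

```python
import math

def sigmoid(y):
    """Logistic activation: maps any real y into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y))

def sigmoid_unit(weights, inputs):
    """Compute the linear combination w . x, then apply the sigmoid to it."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(net)

# With all weights zero the unit outputs exactly 0.5, whatever the inputs are.
output = sigmoid_unit([0.0, 0.0], [1.0, 2.0])
```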


Figure 3.8 shows a multilayer feed-forward network. The inputs correspond to the attributes
measured for the training example. The inputs are fed simultaneously into a layer of units
making up the input layer. The weighted outputs of these units are, in turn, fed simultaneously
to a second layer, called the hidden layer. The hidden layer’s weighted outputs can be input to
another hidden layer, and so on. The number of hidden layers is arbitrary, although in practice
usually only one is used. The weighted outputs of the last hidden layer are fed to the output
layer, which emits the network’s prediction for the given sample. The network is called
feed-forward because none of the weights cycles back to an input unit.

It should be noted that, although the number of units in the input layer is determined by the
number of input features and the number of units in the output layer is determined by the number
of possible outcomes, there is no clear rule for determining the number of units in the hidden
layer. A trial-and-error method is therefore adopted, and the number that gives the best accuracy
is chosen.



Figure 3.8: A multilayer feed-forward network 



Learning Algorithm 

Backpropagation learns by iteratively processing a set of training samples, comparing the
network’s prediction for each sample with the actual known class label. For each training sample,
the weights are modified so as to minimize the mean squared error between the network’s prediction
and the actual class. The modifications are made in the backward direction, i.e. from the output
layer, through the hidden layer, to the input layer. That is why the algorithm is called
“backpropagation”.

The following is the backpropagation algorithm for the feed forward network containing two 
layers of sigmoid units [10]. It uses gradient descent as the training rule. 

Here, each training example is a pair of the form $(\vec{x}, \vec{t})$, where $\vec{x}$ is the
vector of network input values and $\vec{t}$ is the vector of target network output values.
$\eta$ is the learning rate (a small positive number, say 0.01), $n_{in}$ is the number of network
inputs, $n_{hidden}$ is the number of units in the hidden layer and $n_{out}$ is the number of
output units. The input from unit $i$ into unit $j$ is denoted $x_{ji}$, and the weight from unit
$i$ to unit $j$ is denoted $w_{ji}$.

• Create a feed-forward network with $n_{in}$ inputs, $n_{hidden}$ hidden units and $n_{out}$
output units.

• Initialize all network weights to small random numbers.

• Until the terminating condition is met, Do

  a) For each $(\vec{x}, \vec{t})$ in the training examples, Do

     * Propagate the inputs forward through the network:

       1. Input the instance $\vec{x}$ to the network and compute the output $o_j$ of every
          unit $j$ in the hidden or output layer of the network.

     * Propagate the errors backward through the network:

       2. For each network output unit $k$, calculate its error term $\delta_k$

          \[ \delta_k = o_k (1 - o_k)(t_k - o_k) \]

       3. For each hidden unit $h$, calculate its error term $\delta_h$

          \[ \delta_h = o_h (1 - o_h) \sum_{k \in outputs} w_{kh}\, \delta_k \]

       4. Update each network weight $w_{ji}$

          \[ w_{ji} = w_{ji} + \Delta w_{ji} \]

          where

          \[ \Delta w_{ji} = \eta\, \delta_j\, x_{ji} \]

The commonly used terminating conditions are : 

• when the classification error or the percentage of samples misclassified in the previous 
iteration is below some threshold. 

• when a fixed number of iterations have happened. 
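The algorithm above can be sketched for a network with one hidden layer as follows. This is an illustrative sketch, not the code of our system: the bias weight (a constant input of 1 to every unit), the weight range of ±0.05, the fixed number of epochs as the terminating condition and all function names are assumptions made for the example.

```python
import math
import random

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def init_network(n_in, n_hidden, n_out, rng):
    """Small random weights; index 0 of each row is a bias weight (assumption)."""
    w_hidden = [[rng.uniform(-0.05, 0.05) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.05, 0.05) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    return w_hidden, w_out

def forward(net, x):
    """Step 1: propagate the input forward and return the outputs of every layer."""
    w_hidden, w_out = net
    xs = [1.0] + list(x)
    hs = [1.0] + [sigmoid(sum(w * v for w, v in zip(row, xs))) for row in w_hidden]
    out = [sigmoid(sum(w * v for w, v in zip(row, hs))) for row in w_out]
    return xs, hs, out

def train_step(net, x, t, eta):
    w_hidden, w_out = net
    xs, hs, out = forward(net, x)
    # Step 2: delta_k = o_k (1 - o_k) (t_k - o_k) for each output unit k.
    delta_out = [o * (1 - o) * (tk - o) for o, tk in zip(out, t)]
    # Step 3: delta_h = o_h (1 - o_h) * sum_k w_kh delta_k for each hidden unit h.
    delta_hidden = []
    for h in range(1, len(hs)):
        back = sum(w_out[k][h] * delta_out[k] for k in range(len(w_out)))
        delta_hidden.append(hs[h] * (1 - hs[h]) * back)
    # Step 4: w_ji = w_ji + eta * delta_j * x_ji.
    for k, row in enumerate(w_out):
        for i in range(len(row)):
            row[i] += eta * delta_out[k] * hs[i]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += eta * delta_hidden[j] * xs[i]

def train(net, examples, eta, epochs):
    """Terminating condition used here: a fixed number of passes over the examples."""
    for _ in range(epochs):
        for x, t in examples:
            train_step(net, x, t, eta)

def squared_error(net, examples):
    return sum((tk - o) ** 2 for x, t in examples for tk, o in zip(t, forward(net, x)[2]))
```

A call such as `train(init_network(78, 78, 5, random.Random(0)), examples, eta=0.01, epochs=2000)` would correspond to the layer sizes and parameters reported in the next chapter, with `examples` a list of (feature vector, target vector) pairs.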






Chapter 4 

Implementation and Results 


In the previous chapter, we have talked about various building blocks (or stages) of a classification 
system. Here, we will briefly describe how those stages have been implemented for the heart sound 
classification problem. The following figure describes the block diagram of our classification 
system: 





Figure 4.1: Block diagram of our classification system 


4.1 Display Subsystem 

We have implemented the whole system in Java to make it platform independent. The system
displays the phonocardiogram of the heart sound. It also offers other options such as Play,
Show Cycles, Find Features and Predict Disease. The following figure shows the phonocardiogram
generated by our system and the buttons for choosing the options.




[Figure: screenshot of the phonocardiogram display with the buttons Load, Play, Stop,
Show Cycles, Find Features and Predict Disease]



Figure 4.2: The phonocardiogram generated by our system 


First, a heart sound file is selected (at present, we handle uncompressed sound file formats
such as .wav, .au and .aif). Our system then reads the data from the file and plots it as a
phonocardiogram. If the “Play” option is chosen, we can hear the sound of the heart beats while
a cursor moves over the phonocardiogram in synchronization with the sound. This helps doctors
(or any user) to visualize and hear the sounds simultaneously and repeatedly. The “Show Cycles”
option shows the cycles present in the sound sample, with separating marks. When the “Find
Features” option is chosen, the system extracts features from a single cycle of the heart sound
and writes them to a file for further processing. The “Predict Disease” button is used for
predicting the category of the particular sound sample. In the next section, we elaborate on how
these options work.

4.2 Diagnostic Subsystem 

The diagnostic subsystem is implemented in several stages. Here we discuss them one by one.

4.2.1 Data Denoising

The recorded heart sounds often contain noise. It is necessary to enhance the signal-to-noise
ratio before further processing. Here we adopt a simple data smoothing algorithm called the
moving average algorithm.

Using this, an array of raw (noisy) data $[y_1, y_2, \ldots, y_N]$ can be converted to a new
array of smoothed data. The “smoothed point” $(y_k)_{smooth}$ is the average of an odd number
$2n+1$ ($n = 1, 2, 3, \ldots$) of consecutive points of the raw data
$y_{k-n}, y_{k-n+1}, \ldots, y_{k-1}, y_k, y_{k+1}, \ldots, y_{k+n-1}, y_{k+n}$.
Mathematically,

\[ (y_k)_{smooth} = \sum_{i=-n}^{n} \frac{y_{k+i}}{2n+1} \]

The odd number $2n+1$ is usually called the filter width. The greater the filter width, the more
intense is the smoothing effect. This operation is depicted in Figure 4.3.

In this example the filter width is 5. The first five raw data points (black squares in the
second figure) within the rectangle (moving window) are averaged, and their average value is
plotted as smoothed data point 3 (grey square). The rectangle is then moved one point to the
right, points 2 through 6 are averaged, and the average is plotted as smoothed data point 4
(third figure). The same process is carried out for the rest of the points. Finally we get a set
of smoothed points (last figure). This procedure is called 5-point unweighted smoothing.


The signal-to-noise ratio may be further enhanced by increasing the filter width or by smoothing
the data multiple times. However, as the window width increases, information may be lost or
distorted because too much statistical weight is given to points that are well removed from the
central point. To overcome this drawback, we give less weight to the points far from the central
point and more weight to the points near it.
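The smoothing step can be sketched as below. This is a minimal illustration under two assumptions that the text does not settle: points too close to either end to have a full window are left unchanged, and the weighted variant divides by the sum of the weights. The triangular weights in the usage line are only an example of giving more weight to points near the centre.

```python
def moving_average(y, n, weights=None):
    """Smooth y with a window of width 2n+1.

    With weights=None every point in the window counts equally (unweighted
    smoothing). Otherwise `weights` must hold 2n+1 values, one per window position.
    Points without a full window on both sides are copied unchanged (assumption).
    """
    width = 2 * n + 1
    if weights is None:
        weights = [1.0] * width
    if len(weights) != width:
        raise ValueError("weights must have length 2n+1")
    total = sum(weights)
    smoothed = list(y)
    for k in range(n, len(y) - n):
        window = y[k - n:k + n + 1]
        smoothed[k] = sum(w * v for w, v in zip(weights, window)) / total
    return smoothed

# 5-point unweighted smoothing, then a weighted pass favouring the central point.
plain = moving_average([1, 2, 3, 4, 5, 6], n=2)
weighted = moving_average([0, 0, 9, 0, 0], n=2, weights=[1, 2, 3, 2, 1])
```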

Figure 4.3: Moving average algorithm

4.2.2 Cycle Finding

Our approach extracts each cycle in the sample before the classification step. A normal heart
cycle consists of a high-activity region, due to the 1st heart sound (S1), followed by a
low-activity region. We call this low-activity region a silence period. Then there is again a
high-activity region, due to the 2nd heart sound (S2), followed by another silence period. This
pattern is repeated cyclically throughout the whole heart sound sample. Figure 4.4 shows this
clearly.

There may be more sound components, such as the third and fourth heart sounds (S3 and S4), with
corresponding silence periods, but the pattern remains the same from cycle to cycle. We also
observe that the respective durations of these silence periods remain almost the same from cycle
to cycle. We exploit this property to find the cycles.

Using Silence Period:

Let us call the silence period between S2 and S1 alpha and that between S1 and S2 beta. If
there are more heart sound components (or murmurs) such as S3 and S4, we get more silence
periods, such as gamma, delta etc. For the time being let us assume that only S1 and S2 are
present, and hence only alpha and beta. A heart sound sample may contain many cycles, and
therefore many alphas and betas. First, we find all the silence periods present in the sample.
All sample points whose absolute data value falls below a threshold are considered to be part of
a silence period. Then we cluster the silence period lengths using a hierarchical clustering
algorithm. Ideally, all alphas are grouped into one cluster and all betas into another. The same
holds if gamma and delta are present.

Figure 4.4: The alpha and beta regions in heart sound

In the next step, we find the longer of alpha and beta. We have observed that in most cases
alphas are much longer than betas: alphas are typically about 1/2 of the cycle period, whereas
betas are around 1/3 of the cycle. Next, we parse the whole sample data once again. Whenever
a silence period is found, we compare its length with the alpha (or beta, if alpha is not the
longer one) found in the clustering step. If the two are almost equal, we label that silence
period as an alpha region (or beta region).

In the next step, we compare the distances between the start of each alpha region and the
start of the next alpha region. If they are almost equal, then we have found the cycles, and the
cycle durations are these distances.
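The silence-period procedure can be sketched as follows for the case where only two kinds of silence period are present. This is a simplified illustration, not our implementation. It splits the sorted lengths at their largest gap, which for one-dimensional data is what single-linkage hierarchical clustering yields when stopped at two clusters (the linkage is an assumption). The 20% tolerance for "almost equal" and the function names are also assumptions.

```python
def silence_periods(samples, threshold):
    """Return (start, length) for every run of samples with |value| < threshold."""
    periods = []
    start = None
    for i, v in enumerate(samples):
        if abs(v) < threshold:
            if start is None:
                start = i
        elif start is not None:
            periods.append((start, i - start))
            start = None
    if start is not None:
        periods.append((start, len(samples) - start))
    return periods

def split_into_two_clusters(lengths):
    """Split the sorted lengths at the largest gap (needs at least two lengths)."""
    ordered = sorted(lengths)
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]

def cycle_starts(samples, threshold, tolerance=0.2):
    """Start indices of the silence periods belonging to the longer cluster."""
    periods = silence_periods(samples, threshold)
    _, longer = split_into_two_clusters([length for _, length in periods])
    typical = sum(longer) / len(longer)
    return [s for s, length in periods if abs(length - typical) <= tolerance * typical]

# Synthetic signal: burst, short silence, burst, long silence, repeated three times.
signal = ([1.0] * 3 + [0.0] * 4 + [1.0] * 3 + [0.0] * 10) * 3
starts = cycle_starts(signal, threshold=0.5)
distances = [b - a for a, b in zip(starts, starts[1:])]
```

If the values in `distances` are almost equal, they can be taken as the cycle durations; otherwise the method is considered to have failed for that sample.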

Although this technique works in most cases, it fails in some. Failures occur when the noise is
of very high amplitude, when an unknown spike falls inside a silence period (it divides the
silence period, so we get a wrong silence period length), or when alpha and beta are almost equal
in length (alpha and beta then fall into the same cluster and we cannot differentiate between
them). In these cases, we use the next technique.

Using Signal Cluster:

This step is much like the previous one. The only difference is that here, instead of clustering
the silence periods, we cluster the regions of high activity (S1, S2 etc.). The rest of the
procedure remains the same. Here again, we check whether all the “found cycles” are of almost
equal length. If they are, then we have found the cycles. Otherwise, this technique fails too and
the cycles cannot be found. By experimentation we found that the probability that both techniques
fail is very low.

4.2.3 Feature Extraction


As mentioned in the previous chapter, we use the discrete wavelet transform for feature
extraction. There is no fixed guideline for choosing the appropriate wavelet. The choice of the
decomposition wavelet depends on the physiological signal and is made empirically. Earlier
work suggests that Daubechies wavelets are best for signals such as heart sound, ECG etc., and
that daub4, daub6 and daub8 are the most frequently used among them. So, we use the daub4, daub6
and daub8 wavelet coefficients of the Daubechies family in our work. The same experiment was
carried out for these 3 sets of coefficients to see which one gives the best performance.

First, we sample 1024 data points from one heart cycle. Then the wavelet coefficient matrix is
multiplied by the data vector (as described in the previous chapter). The matrix is applied
using a hierarchical algorithm, called a pyramidal algorithm. The wavelet coefficients are
arranged so that the odd rows contain an ordering of the wavelet coefficients that acts as the
smoothing filter, and the even rows contain an ordering of the wavelet coefficients with different
signs that acts to bring out the details. The matrix is first applied to the original, full-length
vector. The resulting smoothed vector has half the length, and the matrix is applied to it again.
Then the smoothed, halved vector is smoothed and halved again, and the matrix is applied once more.
This process continues until one smoothed value and one detail value remain. Thus, each matrix
application brings out a higher resolution of the data while at the same time smoothing the
remaining data. The output of the DWT consists of the remaining “smooth” component and all
of the accumulated “detail” components. This approach yields the required multiresolution
analysis. Figure 4.5 illustrates the procedure. Here h(n) acts as the smoothing filter and
g(n) brings out the detail components.
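The pyramidal procedure can be sketched as below for the four-coefficient Daubechies filter. This is an illustrative sketch rather than our Java implementation: it assumes periodic wrap-around at the end of the vector, a detail filter g derived from h as (h_3, -h_2, h_1, -h_0), and an input length that is a power of two. The names `dwt_step` and `dwt_pyramid` are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Smoothing filter h(n) for daub4 (closed form of the D4 values in Table 4.1).
H = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# Detail filter g(n), derived from h(n).
G = [H[3], -H[2], H[1], -H[0]]

def dwt_step(x):
    """One level: filter with h and g and keep every second output (periodic wrap-around)."""
    n = len(x)
    smooth = [sum(H[j] * x[(2 * k + j) % n] for j in range(4)) for k in range(n // 2)]
    detail = [sum(G[j] * x[(2 * k + j) % n] for j in range(4)) for k in range(n // 2)]
    return smooth, detail

def dwt_pyramid(x):
    """Repeat dwt_step on the smooth part until one smooth and one detail value remain.

    Returns (smooth, details), where details lists the detail bands from the
    finest (shortest scale) to the coarsest. len(x) must be a power of two.
    """
    details = []
    smooth = list(x)
    while len(smooth) >= 2:
        smooth, detail = dwt_step(smooth)
        details.append(detail)
    return smooth, details

# Decompose 1024 samples of a test signal (a sine wave stands in for one heart cycle).
cycle = [math.sin(0.1 * i) for i in range(1024)]
smooth, details = dwt_pyramid(cycle)
```

For 1024 input points the detail bands have lengths 512, 256, ..., 1 and one smooth value remains, so the output again holds 1024 coefficients. Dropping the four finest detail bands leaves 64 of them.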

In the above computation we used the coefficients shown in Table 4.1. The coefficients were
computed by Daubechies [4, 5].

Figure 4.5: Wavelet Decomposition by pyramidal algorithm

4.2.4 Feature Selection

The above wavelet decomposition procedure returns as many as 1024 features. This huge number
of features would be computationally expensive during the training phase, so we discard some of
them using feature selection techniques. First, we discard the 4 levels with the shortest
scale (high frequency, i.e. detail values). This step leaves us with 64 features and substantially
simplifies the neural network used for classification. It also reduces noise (as noise is captured
in the high-frequency levels of the wavelet decomposition). Next, we take statistics over the set
of wavelet coefficients and use them as features. We take the following statistics:

1. The mean of the absolute value of the coefficients in each subband.

These features provide information about the frequency distribution of the heart sound.

2. The standard deviation of the coefficients in each subband.

These features provide information about the amount of change in the frequency distribution.

3. Ratios of the mean values between adjacent subbands.

These features also provide information about the frequency distribution.

This results in another 14 features. We use these 78 features (64 + 14) for classification.
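The three statistics can be computed as in the sketch below. It is only an illustration under several assumptions: the standard deviation is the population form, each ratio divides the mean of a subband by the mean of the next one, a zero denominator gives a ratio of 0, and the grouping of coefficients into subbands is left to the caller because the text does not state which subbands produce the 14 values. The function name is hypothetical.

```python
import math

def subband_statistics(subbands):
    """Return mean |c| per subband, standard deviation per subband, and the
    ratios of mean |c| between adjacent subbands, as one flat feature list."""
    means = [sum(abs(c) for c in band) / len(band) for band in subbands]
    stds = []
    for band in subbands:
        mu = sum(band) / len(band)
        stds.append(math.sqrt(sum((c - mu) ** 2 for c in band) / len(band)))
    ratios = [
        means[i] / means[i + 1] if means[i + 1] != 0 else 0.0
        for i in range(len(means) - 1)
    ]
    return means + stds + ratios

features = subband_statistics([[1.0, -1.0], [2.0, 2.0]])
```

For B subbands the function returns 3B - 1 values, so five subbands would give 14 statistics. That is one grouping consistent with the count reported above, not a confirmed detail of our system.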

Type    Index    Coefficient

D4      0         0.4829629131445341
D4      1         0.8365163037378079
D4      2         0.2241438680420134
D4      3        -0.1294095225512604

D6      0         0.3326705529500825
D6      1         0.8068915093110924
D6      2         0.4598775021184914
D6      3        -0.1350110200102546
D6      4        -0.0854412738820267
D6      5         0.0352262918857095

D8      0         0.2303778133088964
D8      1         0.7148465705529154
D8      2         0.6308807679398587
D8      3        -0.0279837694168599
D8      4        -0.1870348117190931
D8      5         0.0308413818355607
D8      6         0.0328830116668852
D8      7        -0.0105974017850690

Table 4.1: Daubechies’ Wavelet Coefficients

4.2.5 Classification

We use a standard two-layer, fully connected feed-forward network (one hidden layer and one
output layer of sigmoid units) for classification.


Training 

The 78 features obtained from each sample are fed to the neural network for training, so the
number of units in the input layer is 78. We have five categories of heart sounds (aortic
regurgitation, aortic stenosis, mitral regurgitation, mitral stenosis and normal heart sound), so
the output layer has 5 units. As mentioned earlier, there is no fixed rule for choosing the
number of units in the hidden layer, so we chose the number that gave the best results. For our
system, we found 78 (the number of input nodes) to be the most appropriate size for the hidden
layer.

We use the back-propagation algorithm for training the network. The values of its parameters
were chosen empirically, based on the results they gave. Table 4.2 shows the chosen values.

At the end of the training, we get the weight matrix. The matrix is stored in a file for use
during the test process.

The time required for the above training process is around 70 seconds.






Parameter       Value

$\eta$          0.01
epoch           2000
$n_{input}$     78
$n_{hidden}$    78
$n_{output}$    5


Table 4.2: Different Parameters 


Testing 

Testing on an unknown sample is done by extracting the 78 features from the test sample and 
feeding them as inputs to the trained network. The feed-forward computation is performed, 
which gives 5 outputs at the 5 output nodes. We choose the node with the highest output value 
as the prediction. The time required for testing is 200-400 milliseconds. 

We also use a calculated value called confidence of the prediction. If the difference between 
the highest output value and the second highest output value is very small (smaller than a 
threshold), we say that the confidence of the prediction is low and we give both the highest and 
second highest output, as the first and second preference for the predicted disease. 
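The prediction rule with its confidence check can be sketched as follows. This is a minimal illustration: the threshold value of 0.1 and the order of the class labels at the output nodes are hypothetical, since neither is stated in the text.

```python
# Class labels in an assumed output-node order (hypothetical).
CLASSES = [
    "aortic regurgitation",
    "aortic stenosis",
    "mitral regurgitation",
    "mitral stenosis",
    "normal heart sound",
]

def predict(outputs, threshold=0.1, classes=CLASSES):
    """Return [first choice], or [first, second] when the confidence is low.

    Confidence is low when the gap between the two highest output values is
    smaller than `threshold` (an assumed value).
    """
    ranked = sorted(range(len(outputs)), key=lambda i: outputs[i], reverse=True)
    first, second = ranked[0], ranked[1]
    if outputs[first] - outputs[second] < threshold:
        return [classes[first], classes[second]]
    return [classes[first]]

confident = predict([0.1, 0.9, 0.2, 0.1, 0.05])
uncertain = predict([0.1, 0.52, 0.5, 0.1, 0.05])
```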

4.3 Results 

We collected 56 heart sound samples of five categories (aortic regurgitation, aortic stenosis,
mitral regurgitation, mitral stenosis and normal heart sound). All samples were downloaded from
the Internet [23, 24, 25, 26, 27]. To evaluate the performance of our system we used the 80-20
method: 80% of the samples, randomly chosen from the 56 samples, were used for training and the
remaining 20% were used for testing. This process was repeated 50 times. The average
classification accuracies obtained for the daub4, daub6 and daub8 wavelets are shown in the
following table.


Wavelet    Train Samples    Test Samples    Accuracy

daub4      42               14              77%
daub6      42               14              77.14%
daub8      42               14              81.86%

Table 4.3: Classification Accuracy 


We also noticed that most of the wrong predictions were of “low” confidence, and that in these
cases the right prediction was present as the second option.

From the results, it is clear that the daub8 wavelet performs better than the other two.
So, we use daub8 in the final diagnosis system.

Chapter 5

Conclusion and Future Work 


5.1 Conclusion 

In our work, we have developed an intelligent heart disease classification system for the
prediction of possible heart disease by analysing the heart sound signal. Feature extraction was
carried out using discrete wavelet decomposition for multi-scale analysis, while classification
was carried out by a feed-forward neural network with the back-propagation learning algorithm. We
obtained an accuracy of around 81% in classification. This is lower than the results
obtained by the other works that we reviewed. Todd R. Reed, Nancy E. Reed and Peter Fritzson
[17] achieved 100% accuracy in classifying heart sounds into 5 categories. Ibrahim Turkoglu,
Ahmet Arslan and Ilkay Erdogan [3] obtained almost 94% accuracy in detecting mitral valve
diseases. The system developed by Onsy Abdel-Alim, Naddar Hamdy and Mohammed A. E.
[16] gave 95% accuracy in the diagnosis process. So, the performance of our system is not very
satisfactory. However, the result of this work is promising, considering that there were not
enough training samples to train the system properly. In addition, we automated the cycle
extraction process, which failed to detect the actual cycles in some cases. We also faced the
following problems, which contributed to this unsatisfactory performance.

5.1.1 Problems faced 

1. The position of the stethoscope during recording gives useful information about the heart
sound, but it was not provided with our samples.

2. In some cases, the noise present in the samples was very high.

3. The number of samples was not adequate to train the neural network properly.

4. Information such as the age and sex of the patient was missing. This information is
important for a proper diagnosis.

5.1.2 Advantages of our system

The system that we have developed has the following advantages.

1. It is robust to moderate levels of noise.

2. The cycles are extracted by the system, so there is no need to segment manually the cycles
present in the sound sample.

3. Our system is fast, because the time it takes for the testing step is very low (200-400
milliseconds).

4. It is fully automated and requires no special skill to operate.

5. The implementation cost is very low compared to other techniques (such as ECG etc.).

5.1.3 Drawbacks of the system 

1. Our system may fail if the noise level in the samples is very high.

2. If the cycle length (in time) and the signal pattern vary a lot from cycle to cycle (which is
a very rare case), then our cycle-finding technique will fail.

5.2 Future Work 

Having obtained this moderate accuracy, we should aim for higher accuracy, if possible near
100%. For this we should make our cycle-detection technique more reliable. We also need better
methods for handling high noise. We plan to investigate the use of other wavelet families.
Another interesting direction is combining features from different analysis techniques.

Another approach that could be developed, and that would hopefully give much better results, is
a system that combines several techniques, such as heart sounds from a stethoscope, ECG and
Doppler ultrasound, and predicts the disease based on the combined result of these techniques.